Chinese Dependency Parsing with Large Scale Automatically Constructed Case Structures

نویسندگان

  • Kun Yu
  • Daisuke Kawahara
  • Sadao Kurohashi
چکیده

This paper proposes an approach using large scale case structures, which are automatically constructed from both a small tagged corpus and a large raw corpus, to improve Chinese dependency parsing. The case structure proposed in this paper has two characteristics: (1) it relaxes the predicate of a case structure to be all types of words which behaves as a head; (2) it is not categorized by semantic roles but marked by the neighboring modifiers attached to a head. Experimental results based on Penn Chinese Treebank show the proposed approach achieved 87.26% on unlabeled attachment score, which significantly outperformed the baseline parser without using case structures.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Treebank-Based Acquisition of Chinese LFG Resources for Parsing and Generation

This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena an...

متن کامل

Character-Level Chinese Dependency Parsing

Recent work on Chinese analysis has led to large-scale annotations of the internal structures of words, enabling characterlevel analysis of Chinese syntactic structures. In this paper, we investigate the problem of character-level Chinese dependency parsing, building dependency trees over characters. Character-level information can benefit downstream applications by offering flexible granularit...

متن کامل

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

متن کامل

Treebank-Based Acquisition of LFG Resources for Chinese

This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proofof-concept work of (Burke et al., 2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially exten...

متن کامل

Treebank-Based Acquisition of LFG Parsing Resources for French

Motivated by the expense in time and other resources to produce hand-crafted grammars, there has been increased interest in automatically obtained wide-coverage grammars from treebanks for natural language processing. In particular, recent years have seen the growth in interest in automatically obtained deep resources that can represent information absent from simple CFG-type structured treeban...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008